
Electric Vehicle Data Analysis¶


Project Title: Ranking of Electric Vehicles Based on Electric Range and the Efficiency Derived from the Cars' Eligibility.


Data Source: catalog.data.gov/dataset/electric-vehicle-population-data¶

Project Description:¶

Electric vehicles have grown quite popular over the last decade because they are clean to run and their batteries offer good mileage. One of their attractive features is that they are environmentally friendly, as they emit fewer air pollutants. Electricity is also readily available, thanks in part to sources such as nuclear power plants and geothermal energy. This project deals with information about electric vehicles over the last 25 years. The dataset was collected from data.gov.

Project Goals:¶

The goal of this project is to rank the electric car companies based on the mileage (electric range) provided by the cars of these companies.

Project Outlines:¶

  • Evaluation of the electric vehicles based on "Clean Alternative Fuel Vehicle Eligibility".
  • Identification of the better performing vehicle type.
  • Visualizing the electric range of the car companies over the last two decades.
  • Ranking the car companies based on eligibility efficiency and electric range (mileage).

Task Resolution:¶

Let us first load the dataset and inspect it. We start by importing the libraries and creating the DataFrame.

Importing the Python libraries:

In [1]:
# Importing python libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Creating pandas DataFrame from csv file:

In [2]:
# Reading csv file and Creating dataframe

raw_data = pd.read_csv("C:/Users/mahbu/Downloads/Electric_Vehicle_Population_Data.csv")
data = pd.DataFrame(raw_data)

Showing the first and last 5 rows of the DataFrame:

In [3]:
# Showing the first 5 rows of the DataFrame

pd.set_option('display.max_colwidth', 80)
data.head(5)
Out[3]:
VIN (1-10) County City State Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District DOL Vehicle ID Vehicle Location Electric Utility 2020 Census Tract
0 5YJXCAE26J Yakima Yakima WA 98908.0 2018 TESLA MODEL X Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 238 0 14.0 141151601 POINT (-120.56916 46.58514) PACIFICORP 5.307700e+10
1 JHMZC5F37M Kitsap Poulsbo WA 98370.0 2021 HONDA CLARITY Plug-in Hybrid Electric Vehicle (PHEV) Clean Alternative Fuel Vehicle Eligible 47 0 23.0 171566447 POINT (-122.64681 47.73689) PUGET SOUND ENERGY INC 5.303509e+10
2 5YJ3E1EB0K King Seattle WA 98199.0 2019 TESLA MODEL 3 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 220 0 36.0 9426525 POINT (-122.40092 47.65908) CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA) 5.303301e+10
3 1N4AZ0CP5D King Seattle WA 98119.0 2013 NISSAN LEAF Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 75 0 36.0 211807760 POINT (-122.3684 47.64586) CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA) 5.303301e+10
4 5YJSA1E21H Thurston Lacey WA 98516.0 2017 TESLA MODEL S Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 210 0 22.0 185810306 POINT (-122.75379 47.06316) PUGET SOUND ENERGY INC 5.306701e+10
In [4]:
# Showing the last 5 rows of the DataFrame

data.tail(5)
Out[4]:
VIN (1-10) County City State Postal Code Model Year Make Model Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range Base MSRP Legislative District DOL Vehicle ID Vehicle Location Electric Utility 2020 Census Tract
130438 7SAYGDEE6P Pierce Gig Harbor WA 98335.0 2023 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not been researched 0 0 26.0 231134102 POINT (-122.58354539999999 47.32344880000005) BONNEVILLE POWER ADMINISTRATION||CITY OF TACOMA - (WA)||PENINSULA LIGHT COMPANY 5.305307e+10
130439 1N4BZ1CV7N Pierce Tacoma WA 98408.0 2022 NISSAN LEAF Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not been researched 0 0 29.0 185810943 POINT (-122.43810499999995 47.203220000000044) BONNEVILLE POWER ADMINISTRATION||CITY OF TACOMA - (WA)||PENINSULA LIGHT COMPANY 5.305306e+10
130440 5YJYGDEE8M King Seattle WA 98109.0 2021 TESLA MODEL Y Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not been researched 0 0 36.0 176542418 POINT (-122.35022 47.63824) CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA) 5.303301e+10
130441 5YJXCBE22L Island Camano Island WA 98282.0 2020 TESLA MODEL X Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 293 0 10.0 102834938 POINT (-122.40049 48.23986) BONNEVILLE POWER ADMINISTRATION||PUD 1 OF SNOHOMISH COUNTY 5.302997e+10
130442 5YJ3E1EA5M Pierce Puyallup WA 98375.0 2021 TESLA MODEL 3 Battery Electric Vehicle (BEV) Eligibility unknown as battery range has not been researched 0 0 2.0 180473639 POINT (-122.30116 47.1165) BONNEVILLE POWER ADMINISTRATION||CITY OF TACOMA - (WA)||PENINSULA LIGHT COMPANY 5.305307e+10

Evaluation of the Electric Car Scenario Based on Clean Alternative Fuel Vehicle Eligibility¶

To evaluate the eligibility scenario of the electric cars, let's count the cars in each clean alternative fuel vehicle eligibility category.

In [5]:
# Evaluation based on CAFV eligibility

CAFV_eligibility_data = data.groupby(by=['Clean Alternative Fuel Vehicle (CAFV) Eligibility'])[['Make']].count()
CAFV_eligibility_data
Out[5]:
Make
Clean Alternative Fuel Vehicle (CAFV) Eligibility
Clean Alternative Fuel Vehicle Eligible 60551
Eligibility unknown as battery range has not been researched 53446
Not eligible due to low battery range 16446
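
The same per-category counts can also be obtained with `value_counts()`. The sketch below is only an illustration: `toy` is a small made-up DataFrame (hypothetical rows, not the real file), so its numbers do not correspond to the output above.

```python
import pandas as pd

# Hypothetical toy rows standing in for the real dataset (not actual data)
toy = pd.DataFrame({
    "Make": ["TESLA", "HONDA", "TOYOTA", "TESLA", "FORD"],
    "Clean Alternative Fuel Vehicle (CAFV) Eligibility": [
        "Clean Alternative Fuel Vehicle Eligible",
        "Clean Alternative Fuel Vehicle Eligible",
        "Not eligible due to low battery range",
        "Eligibility unknown as battery range has not been researched",
        "Not eligible due to low battery range",
    ],
})

# value_counts() counts the rows in each category directly
category_counts = toy["Clean Alternative Fuel Vehicle (CAFV) Eligibility"].value_counts()

# Share of each category in the whole table
category_share = category_counts / category_counts.sum()

print(category_counts)
print(category_share.round(2))
```

Compared with `groupby(...).count()`, this avoids picking an arbitrary column such as `Make` to count on.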
In [6]:
# Listing the CAFV eligibility categories and finding the count of each category

CAFV_category_list = data['Clean Alternative Fuel Vehicle (CAFV) Eligibility'].unique()
CAFV_category_list.sort()
CAFV_category = {}
for list_item in CAFV_category_list:
    CAFV_category[list_item] = CAFV_eligibility_data.loc[list_item,'Make']    
print("Clean Alternative Fuel Vehicle (CAFV) Eligibility Category Values:\n{}".format(CAFV_category))

# Calculating eligibility states

Eligibility_states = {}
Eligibility_states['Known'] = (CAFV_category["Clean Alternative Fuel Vehicle Eligible"]+
                               CAFV_category["Not eligible due to low battery range"])
Eligibility_states['Unknown'] = CAFV_category["Eligibility unknown as battery range has not been researched"]
print("\nClean Alternative Fuel Vehicle (CAFV) Eligibility States: {}".format(Eligibility_states))

# Visualizing the CAFV for all electric cars

font = {'family': 'Times New Roman','color':'black','weight': 'bold','size': 14,}
font_tick = {"family":"Times New Roman","size":13,"weight":"bold","style":"normal"}
color_list = ['#17594A','#F29727','#F24C3D']

fig,(ax2, ax1) = plt.subplots(1, 2,sharey=True)
fig.suptitle('Clean Alternative Fuel Vehicle (CAFV) Eligibility For Electric Vehicles',size=15,weight='bold')
fig.set_figwidth(16)
fig.set_figheight(8)

color_index_ax1 = 0
for keys,values in CAFV_category.items():
    bar_plot_1 = ax1.bar(keys,values,label=keys,width=0.6,color=color_list[color_index_ax1])
    ax1.bar_label(bar_plot_1,padding = 3 )
    ax1.legend(prop={'weight':"normal"})
    color_index_ax1 += 1
ax1.set_title("CAFV Category Value Distribution",fontdict=font)
ax1.set_xticks(list(CAFV_category.keys()),labels=['Eligible','Unknown','Ineligible'],fontproperties=font_tick)

color_index_ax2 = 0
for key,value in Eligibility_states.items():
    bar_plot_2=ax2.bar(key,value,label=key,width=0.3,color=color_list[color_index_ax2])
    ax2.bar_label(bar_plot_2,padding=3)
    ax2.legend(prop={'weight':"normal"})
    color_index_ax2 += 1
ax2.set_title("CAFV Eligibility State Distribution",fontdict=font)
ax2.set_ylabel("No. of Cars",fontdict=font,labelpad=15)
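ax2.set_yticks(np.arange(0,85000,20000))
# Text properties passed to set_yticks only take effect together with labels, so the tick labels are styled separately
plt.setp(ax2.get_yticklabels(),**font_tick)
ax2.tick_params(axis='y',pad=15)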
ax2.set_xticks(list(Eligibility_states.keys()),labels=['Known Eligibility','Unknown Eligibility'],fontproperties=font_tick)

plt.show()
Clean Alternative Fuel Vehicle (CAFV) Eligibility Category Values:
{'Clean Alternative Fuel Vehicle Eligible': 60551, 'Eligibility unknown as battery range has not been researched': 53446, 'Not eligible due to low battery range': 16446}

Clean Alternative Fuel Vehicle (CAFV) Eligibility States: {'Known': 76997, 'Unknown': 53446}

Key Takeaways:¶

The vehicles have two states of eligibility:

  1. Known
  2. Unknown.

For vehicles whose eligibility is known, there are two possible outcomes:

  • Eligible
  • Ineligible.
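
The two-level grouping above can be sketched as a pair of lookups. The names `state_map` and `outcome_map` are hypothetical helpers introduced for illustration, and `labels` is a made-up sample rather than the real column.

```python
import pandas as pd

# Category labels as they appear in the dataset
ELIGIBLE = "Clean Alternative Fuel Vehicle Eligible"
INELIGIBLE = "Not eligible due to low battery range"
UNKNOWN = "Eligibility unknown as battery range has not been researched"

# Hypothetical helper mappings (names are illustrative, not from the notebook)
state_map = {ELIGIBLE: "Known", INELIGIBLE: "Known", UNKNOWN: "Unknown"}
outcome_map = {ELIGIBLE: "Eligible", INELIGIBLE: "Ineligible"}

# Made-up sample of category labels
labels = pd.Series([ELIGIBLE, UNKNOWN, INELIGIBLE, ELIGIBLE])

states = labels.map(state_map)
outcomes = labels.map(outcome_map)  # unknown rows have no outcome and become NaN

print(pd.DataFrame({"state": states, "outcome": outcomes}))
```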

Figuring Out the Condition of Eligibility¶

To discover the condition on which the eligibility has been categorized, we will find the minimum and maximum electric range for each category. We take this approach because one of the categories is named 'Not eligible due to low battery range', which suggests that electric range is the deciding factor. We will create a class that takes a category as input and returns the electric range values.

In [7]:
# Class for finding electric range

class Electric_range_converter():
    def __init__(self,dataframe,eligibility_category_name):
        self.dataframe = dataframe
        self.eligibility_category_name = eligibility_category_name
        
    def category_name_dataframe(self):
        self.category_dataframe = self.dataframe[self.dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']==self.eligibility_category_name]
        self.category_dataframe = self.category_dataframe[['Make','Model Year','Electric Vehicle Type','Clean Alternative Fuel Vehicle (CAFV) Eligibility','Electric Range']]
        return self.category_dataframe

    def range_value_electric_range(self):
        self.category_dataframe = self.dataframe[self.dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']==self.eligibility_category_name]
        self.max_value = self.category_dataframe['Electric Range'].max()
        self.min_value = self.category_dataframe['Electric Range'].min()
        self.range_value = self.max_value - self.min_value
        return "{}-{}".format(self.min_value,self.max_value)
    
    def electric_range_by_company(self,company_name):
        self.category_dataframe = self.dataframe[self.dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']==self.eligibility_category_name]
        self.category_dataframe = self.category_dataframe[['Make','Model Year','Electric Vehicle Type','Clean Alternative Fuel Vehicle (CAFV) Eligibility','Electric Range']]
        self.company_name = company_name
        self.category_by_company = self.category_dataframe[self.category_dataframe['Make']==self.company_name]
        self.category_by_company = self.category_by_company.reset_index(drop=True)
        return self.category_by_company

Let's find the electric range for each eligibility category.

In [8]:
# Checking the eligibility and electric range

Eligibile_cars = Electric_range_converter(data,'Clean Alternative Fuel Vehicle Eligible')
Ineligible_cars = Electric_range_converter(data,'Not eligible due to low battery range')
Unknown_eligibility = Electric_range_converter(data,'Eligibility unknown as battery range has not been researched')

Eligibility_Range = {}
Eligibility_Range['Eligible Cars'] = Eligibile_cars.range_value_electric_range()
Eligibility_Range['Ineligible_cars'] = Ineligible_cars.range_value_electric_range()
Eligibility_Range['Unknown_eligibility'] = Unknown_eligibility.range_value_electric_range()

print('Electric Range of Cars: {}'.format(Eligibility_Range))
Electric Range of Cars: {'Eligible Cars': '30-337', 'Ineligible_cars': '6-29', 'Unknown_eligibility': '0-0'}

So, the eligibility of the cars is based on their electric range. The hint was in the category name 'Not eligible due to low battery range'. Electric range refers to how far a vehicle can go on a single charge. It is calculated by dividing the amount of energy in the batteries (kWh) by the efficiency of the vehicle, that is, its energy consumption per mile (kWh/mile). It can be expressed as,

$$\text {Electric Range} = \frac {\text {Available energy in batteries (kWh)}} {\text {Vehicle efficiency (kWh/mile)}}$$
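
As a quick worked instance of this formula, the sketch below computes the range of a hypothetical vehicle. The function name and the numbers (75 kWh, 0.25 kWh/mile) are assumptions chosen for illustration and do not come from the dataset.

```python
def electric_range_miles(battery_energy_kwh: float, efficiency_kwh_per_mile: float) -> float:
    """Electric range = available battery energy / energy used per mile."""
    if efficiency_kwh_per_mile <= 0:
        raise ValueError("efficiency must be positive")
    return battery_energy_kwh / efficiency_kwh_per_mile

# Hypothetical vehicle: 75 kWh of usable energy, consuming 0.25 kWh per mile
example_range = electric_range_miles(75.0, 0.25)
print(example_range)  # 300.0
```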

The dataset shows that all cars with an electric range of 30 or above fall in the category Clean Alternative Fuel Vehicle Eligible. In short, cars with an electric range of 30 or above are eligible. Similarly, cars with a nonzero electric range below 30 (6-29 in this dataset) are ineligible due to low battery range. Finally, the cars with unknown eligibility do not have a known electric range, which is why their electric range is recorded as 0. The actual eligibility of these cars can only be determined once their range has been researched.
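
The pattern described above can be written down as a small function. This is a sketch of the rule as inferred from the minimum and maximum ranges in this dataset, not an official definition of CAFV eligibility; the function name is hypothetical.

```python
def infer_cafv_category(electric_range: int) -> str:
    """Reproduce the pattern observed in the dataset (an inferred rule, not an official definition)."""
    if electric_range >= 30:
        return "Clean Alternative Fuel Vehicle Eligible"
    if electric_range > 0:
        return "Not eligible due to low battery range"
    # A recorded range of 0 marks cars whose range has not been researched
    return "Eligibility unknown as battery range has not been researched"

for value in (0, 6, 29, 30, 337):
    print(value, "->", infer_cafv_category(value))
```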

Identification of the Better Performing Vehicle Types¶

So far we have established that the electric range of a car defines its eligibility. Now, let us dig deeper and find out how much electric range each type of car provides. First, let's check the eligible cars.

In [9]:
# Eligible Cars DataFrame

Eligible_cars_dataframe = Eligibile_cars.category_name_dataframe()
Eligible_cars_dataframe
Out[9]:
Make Model Year Electric Vehicle Type Clean Alternative Fuel Vehicle (CAFV) Eligibility Electric Range
0 TESLA 2018 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 238
1 HONDA 2021 Plug-in Hybrid Electric Vehicle (PHEV) Clean Alternative Fuel Vehicle Eligible 47
2 TESLA 2019 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 220
3 NISSAN 2013 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 75
4 TESLA 2017 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 210
... ... ... ... ... ...
130428 NISSAN 2018 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 151
130429 BMW 2022 Plug-in Hybrid Electric Vehicle (PHEV) Clean Alternative Fuel Vehicle Eligible 30
130432 TESLA 2020 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 291
130436 TESLA 2018 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 215
130441 TESLA 2020 Battery Electric Vehicle (BEV) Clean Alternative Fuel Vehicle Eligible 293

60551 rows × 5 columns

In [10]:
# Grouping by Electric Vehicle Type (Eligible)

Eligible_cars_vehicle_type_count = Eligible_cars_dataframe.groupby(by=['Electric Vehicle Type'])[['Make']].count()
Eligible_cars_vehicle_type_count
Out[10]:
Make
Electric Vehicle Type
Battery Electric Vehicle (BEV) 46701
Plug-in Hybrid Electric Vehicle (PHEV) 13850

So, 46701 BEVs (Battery Electric Vehicles) have an electric range of 30 or more, while 13850 PHEVs (Plug-in Hybrid Electric Vehicles) reach the same range. Similarly, let's inspect the ineligible cars.

In [11]:
# Grouping by Electric Vehicle Type (Ineligible)

Ineligible_cars_dataframe = Ineligible_cars.category_name_dataframe()
Ineligible_cars_vehicle_type_count = Ineligible_cars_dataframe.groupby(by=['Electric Vehicle Type'])[['Make']].count()
Ineligible_cars_vehicle_type_count
Out[11]:
Make
Electric Vehicle Type
Battery Electric Vehicle (BEV) 9
Plug-in Hybrid Electric Vehicle (PHEV) 16437

Only 9 BEVs are in the ineligible electric range, while 16437 PHEVs are in the same range, which is huge compared to BEVs. So, based on the data, we can conclude that BEVs perform better than PHEVs in terms of electric range.
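
The two groupings above can also be viewed side by side as a single cross-tabulation. The sketch below uses a made-up DataFrame (`toy`, with shortened hypothetical labels) to show the idea with `pd.crosstab`.

```python
import pandas as pd

# Hypothetical toy rows (not actual data)
toy = pd.DataFrame({
    "Electric Vehicle Type": ["BEV", "BEV", "PHEV", "PHEV", "PHEV", "BEV"],
    "Eligibility": ["Eligible", "Eligible", "Ineligible", "Eligible", "Ineligible", "Eligible"],
})

# Rows: vehicle type, columns: eligibility outcome, values: number of cars
type_vs_eligibility = pd.crosstab(toy["Electric Vehicle Type"], toy["Eligibility"])

# Fraction of each vehicle type that is eligible
eligible_share = type_vs_eligibility["Eligible"] / type_vs_eligibility.sum(axis=1)

print(type_vs_eligibility)
print(eligible_share.round(3))
```

Applied to the counts reported above, the same ratio gives roughly 99.98% eligible for BEVs (46701 of 46710) and about 45.7% for PHEVs (13850 of 30287).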

Ranking of the Companies¶

To rank the vehicle companies, we need to find out which company produced the largest number of clean alternative fuel vehicles. We will also check how each company's production progressed and what percentage of its total production consists of environmentally friendly vehicles.

In [12]:
# Checking the production case based on make over the years

Make_raw_data = data.groupby(by=['Make'])[['Model']].count()
Make_raw_data
Out[12]:
Model
Make
AUDI 2622
AZURE DYNAMICS 7
BENTLEY 3
BMW 5696
CADILLAC 103
CHEVROLET 11251
CHRYSLER 2139
FIAT 803
FISKER 14
FORD 6743
GENESIS 54
HONDA 791
HYUNDAI 2144
JAGUAR 222
JEEP 2328
KIA 5252
LAND ROVER 41
LEXUS 64
LINCOLN 208
LUCID 133
MERCEDES-BENZ 711
MINI 728
MITSUBISHI 710
NISSAN 13023
POLESTAR 648
PORSCHE 936
RIVIAN 1612
SMART 276
SUBARU 231
TESLA 59629
TH!NK 3
TOYOTA 4770
VOLKSWAGEN 3432
VOLVO 2891
WHEEGO ELECTRIC CARS 3

For the analysis, we will take the companies that have produced more than 500 cars. Since these companies produced more cars, the number of environmentally friendly (CAFV eligible) cars is presumably larger for them as well.

In [13]:
# Taking data of the companies having more than 500 productions

Make_larger_500_data = Make_raw_data[Make_raw_data['Model']>500]
In [14]:
# Extracting data for environmental friendly vehicle analysis

Larger_500_production_companies_list = list(Make_larger_500_data.index.values)
print("Larger Production Companies: {}\n".format(Larger_500_production_companies_list))

Make_data = data[data['Make'].isin(Make_larger_500_data.index.values)]

Each_company_production_details = {}
for company in Larger_500_production_companies_list:
    Each_company_production_details[company] = Make_data[Make_data['Make']==company][['Make','Model Year','Electric Vehicle Type','Clean Alternative Fuel Vehicle (CAFV) Eligibility','Electric Range']]
Larger Production Companies: ['AUDI', 'BMW', 'CHEVROLET', 'CHRYSLER', 'FIAT', 'FORD', 'HONDA', 'HYUNDAI', 'JEEP', 'KIA', 'MERCEDES-BENZ', 'MINI', 'MITSUBISHI', 'NISSAN', 'POLESTAR', 'PORSCHE', 'RIVIAN', 'TESLA', 'TOYOTA', 'VOLKSWAGEN', 'VOLVO']

Visualizing Electric Range¶

Let us visualize the electric range of the top electric car producing companies.

In [15]:
# Visualization

# Specifying the fonts and fontproperties for title,labels and ticks
font = {'family': ['Times New Roman','serif'],'color':  'black','weight': 'bold','size': 14,}
font_tick = {"family":"Times New Roman","size":12,"weight":"heavy","style":"normal"}

# Plotting the result_data
fig,(ax1,ax2)=plt.subplots(2,1)
fig.set_figwidth(16)
fig.set_figheight(12)
fig.suptitle("Electric Range Over Last 25 Years",size=15,weight='bold',y=0.92)

scatter_plot1 = []
scatter_plot2 = []

for company_index, company in enumerate(Larger_500_production_companies_list):
    condition_1 = (data['Make']==company)
    condition_2 = (data['Clean Alternative Fuel Vehicle (CAFV) Eligibility'] !='Eligibility unknown as battery range has not been researched')
    conditions = condition_1 & condition_2
    # Cars with unknown eligibility are excluded in both panels
    electric_range_dataframe = data[conditions]
    if company_index < 12:
        scatter_plot1.append(ax1.scatter(electric_range_dataframe.loc[:,'Model Year'],
                                    electric_range_dataframe.loc[:,'Electric Range'],alpha=0.8))
    else:
        scatter_plot2.append(ax2.scatter(electric_range_dataframe.loc[:,'Model Year'],
                                    electric_range_dataframe.loc[:,'Electric Range'],alpha=0.8))
                           
ax1.set_ylabel("Electric Range",fontdict=font,labelpad=15)
ax1.set_xticks(np.arange(1996,2024,3))
ax1.set_yticks(np.arange(0,360,30))
ax1.legend(Larger_500_production_companies_list[0:12])

ax2.set_ylabel("Electric Range",fontdict=font,labelpad=15)
ax2.set_xlabel("Model Year",fontdict=font,labelpad=10)
ax2.set_xticks(np.arange(1996,2024,3))
ax2.set_yticks(np.arange(0,360,30))
ax2.legend(Larger_500_production_companies_list[12:])

# Text properties passed to set_xticks/set_yticks only take effect together with labels,
# so the tick-label font is applied separately
for axis in (ax1,ax2):
    plt.setp(axis.get_xticklabels()+axis.get_yticklabels(),**font_tick)

plt.show()

According to the plot, CHEVROLET was the first company to produce an electric car, followed by FORD and TOYOTA. However, the electric range increased significantly when TESLA started producing electric cars in 2008. That is one of the reasons why TESLA cars are so popular among electric cars.

From the graph, it is clearly visible that car production took a huge step after 2011: all the car companies produced a significant share of their cars from 2011 onwards. Also, the electric range of the cars reaches up to 337. These data represent only the cars whose eligibility is known. Without knowing the battery range, it would not be wise to include the unknown eligibility data in the ranking; hence those data were eliminated while plotting the graph. Taking all these observations into consideration, we will now build a model to rank the electric cars.

Ranking Model:¶

For ranking purposes, we divide the overall time span into two periods and compute a ranking score for each: before 2011 and after 2011. The phrase "after 2011" refers to model years 2011 through 2023, with 2011 included. The reason for the split is the assumption that the technologies used before 2011 were less advanced, so those scores receive a lower weight. Giving weight $w_{b} = 0.2$ to the ranking score before 2011 and $w_{a} = 0.8$ to the ranking score after 2011, we can express the overall ranking score $R_{overall}$ as,

$$ R_{overall} = R_{b}*w_{b} + R_{a}*w_{a} = 0.2R_{b} + 0.8R_{a}$$

where, $R_{b}$ = Ranking score before 2011 and $R_{a}$ = Ranking score after 2011

We will use two parameters to calculate $R_{x}$: the electric range score of the company's cars, $ER$, and the eligibility efficiency of the company, $\large \eta_{e}$. Here, $R_{x}$ is the generalized ranking score (it is $R_{b}$ when calculated using the data before 2011 and $R_{a}$ when calculated using the data after 2011). We determine $R_{x}$ by adding the electric range score and the scaled eligibility efficiency as follows,

$$ R_{x} = ( ER + 4 * {\large \eta_{e}} ) $$

Since $ER$ is scored on a scale of 6.0, we multiply ${\large \eta_{e}}$ by 4 so that $R_{x}$ is calculated on a scale of 10.0. The electric range score of an individual car, $ER_{i}$, is assigned using the following matrix:

$$\begin{bmatrix} \text{Electric Range} \\0-29 \\ 30-59 \\ 60-89 \\ 90-119 \\ 120-149 \\ 150-179 \\ 180-209 \\ 210-239 \\ 240-269 \\ 270-299 \\ 300-329 \\ 330-359 \end{bmatrix}\ = \begin{bmatrix} \text{Score} \\ 0 \\ 0.5 \\ 1.0 \\ 1.5 \\ 2.0 \\ 2.5 \\ 3.0 \\ 3.5 \\ 4.0 \\ 4.5 \\ 5.0 \\ 6.0 \end{bmatrix} $$
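
The lookup in this matrix can be sketched compactly with `pd.cut`. This is an illustrative alternative, not the implementation used in the notebook; `bin_edges`, `bin_scores` and `range_to_score` are hypothetical names, and the sample ranges are made up.

```python
import numpy as np
import pandas as pd

# Bin edges 0, 30, ..., 360 and the score assigned to each 30-wide bin
bin_edges = np.arange(0, 361, 30)
bin_scores = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0]

def range_to_score(electric_range: pd.Series) -> pd.Series:
    """Map electric range values to scores using left-closed bins [0, 30), [30, 60), ..."""
    binned = pd.cut(electric_range, bins=bin_edges, right=False, labels=bin_scores)
    return binned.astype(float)

sample_ranges = pd.Series([20, 47, 208, 238, 308, 337])
print(range_to_score(sample_ranges).tolist())  # [0.0, 0.5, 3.0, 3.5, 5.0, 6.0]
```

Values of 360 or more fall outside the bins and would come back as NaN in this sketch, so the edges would need extending if longer ranges appear.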

The overall electric range score of a company is the average of the electric range scores of all its cars with known eligibility,

$$ER = \frac {\sum \limits_{i=1}^{n} ER_{i}} {n}$$

Now, the eligibility efficiency of the company will be calculated as,

$${\large \eta_{e}} = \frac {\text{No. of eligible cars produced}} {\text{No. of eligible cars produced + No. of ineligible cars produced}} $$
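
To make the pieces concrete, the sketch below combines the average range score and the eligibility efficiency into $R_{x}$ for one hypothetical company. The function name and the input numbers are assumptions for illustration only.

```python
def ranking_score(range_scores, n_eligible, n_ineligible):
    """Return (ER, eta_e, R_x) for one company and one period."""
    known = n_eligible + n_ineligible
    if known == 0 or len(range_scores) == 0:
        return 0.0, 0.0, 0.0  # no cars with known eligibility
    er = sum(range_scores) / len(range_scores)   # average range score, 0 to 6
    eta_e = n_eligible / known                    # eligibility efficiency, 0 to 1
    return er, eta_e, er + 4 * eta_e              # R_x on a 0 to 10 scale

# Hypothetical company: three eligible cars (scores 3.5, 0.5, 1.0) and one ineligible car (score 0.0)
er, eta_e, r_x = ranking_score([3.5, 0.5, 1.0, 0.0], n_eligible=3, n_ineligible=1)
print(er, eta_e, r_x)  # 1.25 0.75 4.25
```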

Finally, the overall ranking score is the weighted sum of the ranking scores before and after 2011, using the weights introduced above,

$$ R_{overall} = R_{b}*w_{b} + R_{a}*w_{a} = 0.2R_{b} + 0.8R_{a}$$

We will rank all the companies with more than 500 cars in the dataset. Let's create a class to make the computation convenient.

In [16]:
# Ranking Score Class

class Ranking_Score():
    def __init__(self,dataframe,company_name,timezone):
        self.company = company_name
        if timezone == 'before':
            self.dataframe = dataframe[dataframe['Model Year'] < 2011][['Make','Model Year','Clean Alternative Fuel Vehicle (CAFV) Eligibility','Electric Range']]
        elif timezone == 'after':
            self.dataframe = dataframe[dataframe['Model Year'] > 2010][['Make','Model Year','Clean Alternative Fuel Vehicle (CAFV) Eligibility','Electric Range']]
        else:
            raise ValueError("Please enter 'before' or 'after' as the third variable.")
        
    def Electric_range_score(self):
        self.df_condition_1 = (self.dataframe['Make']== self.company)
        self.df_condition_2 = (self.dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility'] !='Eligibility unknown as battery range has not been researched')
        self.cal_cond = self.df_condition_1 & self.df_condition_2
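        # .copy() so that adding the 'Score' column below does not write to a slice of the original data
        self.calculation_dataframe = self.dataframe[self.cal_cond][['Make','Electric Range']].copy()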
        conditions = [
            (self.calculation_dataframe['Electric Range']>0)&(self.calculation_dataframe['Electric Range']<30),
            (self.calculation_dataframe['Electric Range']>29)&(self.calculation_dataframe['Electric Range']<60),
            (self.calculation_dataframe['Electric Range']>59)&(self.calculation_dataframe['Electric Range']<90),
            (self.calculation_dataframe['Electric Range']>89)&(self.calculation_dataframe['Electric Range']<120),
            (self.calculation_dataframe['Electric Range']>119)&(self.calculation_dataframe['Electric Range']<150),
            (self.calculation_dataframe['Electric Range']>149)&(self.calculation_dataframe['Electric Range']<180),
            (self.calculation_dataframe['Electric Range']>179)&(self.calculation_dataframe['Electric Range']<210),
            (self.calculation_dataframe['Electric Range']>209)&(self.calculation_dataframe['Electric Range']<240),
            (self.calculation_dataframe['Electric Range']>239)&(self.calculation_dataframe['Electric Range']<270),
            (self.calculation_dataframe['Electric Range']>269)&(self.calculation_dataframe['Electric Range']<300),
            (self.calculation_dataframe['Electric Range']>299)&(self.calculation_dataframe['Electric Range']<330),
            (self.calculation_dataframe['Electric Range']>329)&(self.calculation_dataframe['Electric Range']<360),
        ]        
        choices = [0.0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,6.0]
        self.calculation_dataframe['Score'] = np.select(conditions,choices,default=0.0)
        if self.calculation_dataframe['Score'].count()!= 0:
            self.score = sum(self.calculation_dataframe['Score'])/self.calculation_dataframe.shape[0]
        else:
            self.score = 0.0
        return self.calculation_dataframe,self.score
    
    def Eligibility_efficiency(self):
        self.cal_dataframe = self.dataframe[self.dataframe['Make']== self.company]
        self.numerator = self.cal_dataframe[(self.cal_dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']=='Clean Alternative Fuel Vehicle Eligible')]['Make'].count()
        self.denominator = self.cal_dataframe[(self.cal_dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']=='Clean Alternative Fuel Vehicle Eligible')]['Make'].count()+self.cal_dataframe[self.cal_dataframe['Clean Alternative Fuel Vehicle (CAFV) Eligibility']=='Not eligible due to low battery range']['Make'].count()
        if self.denominator != 0:
            self.eligibility_efficiency = self.numerator / self.denominator
        else:
            self.eligibility_efficiency = 0.0
        return self.eligibility_efficiency, self.denominator
        

Let us sanity-check the class on a single company, TESLA, for the period after 2011.

In [17]:
# Sanity check of the Ranking_Score class

Tesla_cal = Ranking_Score(data,'TESLA','after')
Tesla_dataframe,score = Tesla_cal.Electric_range_score()
eff, num = Tesla_cal.Eligibility_efficiency()
print('Efficiency: {:0.3f}'.format(eff))
print('Electric_range_score: {:0.3f}'.format(score))
print('Ranking_score: {:0.3f}'.format(4*eff+score))
Tesla_dataframe
Efficiency: 1.000
Electric_range_score: 3.784
Ranking_score: 7.784
Out[17]:
Make Electric Range Score
0 TESLA 238 3.5
2 TESLA 220 3.5
4 TESLA 210 3.5
9 TESLA 208 3.0
12 TESLA 308 5.0
... ... ... ...
130417 TESLA 291 4.5
130418 TESLA 208 3.0
130432 TESLA 291 4.5
130436 TESLA 215 3.5
130441 TESLA 293 4.5

25731 rows × 3 columns

The scores agree with the scoring matrix (for example, an electric range of 238 gives 3.5). Let's rank the companies using the Ranking Model.

Ranking after 2011:

In [18]:
# Ranking of the companies after 2011 using Ranking Score

Company_ranking_score_after_2011 = []
Company_range_score_after_2011 = []
Company_efficiency_after_2011 = []
Company_efficiency_score_after_2011 = []
Company_production_after_2021 = []

for company in Larger_500_production_companies_list:
    company_instance = Ranking_Score(data,company,'after')
    company_dataframe,company_electric_range_score = company_instance.Electric_range_score()
    Company_range_score_after_2011.append(round(company_electric_range_score,3))
    company_efficiency,company_production_number = company_instance.Eligibility_efficiency()
    Company_production_after_2021.append(company_production_number)
    Company_efficiency_after_2011.append(round(company_efficiency,3))
    Company_efficiency_score_after_2011.append(round(4*company_efficiency,3))
    company_ranking = (company_efficiency*4+company_electric_range_score)
    Company_ranking_score_after_2011.append(round(company_ranking,3))

Company_dict_after_2011 = {'Company':Larger_500_production_companies_list,
                           'Cars With Known Eligibility':Company_production_after_2021,
                           'Eligibility Efficiency':Company_efficiency_after_2011,
                           'Efficiency Score':Company_efficiency_score_after_2011,
                           'Range Score':Company_range_score_after_2011,
                           'Ranking Score':Company_ranking_score_after_2011}
print('Ranking After 2011:')
Ranking_from_2011_until_2023 = pd.DataFrame(Company_dict_after_2011)
Ranking_from_2011_until_2023 = Ranking_from_2011_until_2023.sort_values(by=['Ranking Score'],ascending=False)
Ranking_from_2011_until_2023.loc[Ranking_from_2011_until_2023['Cars With Known Eligibility']==0,['Eligibility Efficiency','Efficiency Score','Range Score','Ranking Score']]=r'N\A'
Ranking_from_2011_until_2023.reset_index(drop=True)   
    
Ranking After 2011:
Out[18]:
Company Cars With Known Eligibility Eligibility Efficiency Efficiency Score Range Score Ranking Score
0 TESLA 25731 1.0 4.0 3.784 7.784
1 POLESTAR 104 1.0 4.0 3.5 7.5
2 CHEVROLET 8653 1.0 4.0 1.792 5.792
3 VOLKSWAGEN 1035 1.0 4.0 1.568 5.568
4 NISSAN 11035 1.0 4.0 1.478 5.478
5 FIAT 803 1.0 4.0 1.0 5.0
6 CHRYSLER 2139 1.0 4.0 0.5 4.5
7 HONDA 791 0.989 3.954 0.494 4.449
8 KIA 3116 0.732 2.928 1.38 4.309
9 HYUNDAI 585 0.632 2.53 1.633 4.163
10 BMW 4934 0.653 2.611 0.625 3.237
11 AUDI 1895 0.321 1.285 1.017 2.303
12 MINI 325 0.388 1.551 0.582 2.132
13 PORSCHE 663 0.3 1.201 0.9 2.101
14 VOLVO 2150 0.293 1.172 0.147 1.319
15 TOYOTA 4725 0.281 1.126 0.153 1.279
16 MERCEDES-BENZ 360 0.247 0.989 0.247 1.236
17 MITSUBISHI 710 0.256 1.025 0.173 1.198
18 FORD 3900 0.136 0.544 0.104 0.648
19 RIVIAN 0 N\A N\A N\A N\A
20 JEEP 2328 0.0 0.0 0.0 0.0

Ranking before 2011:

In [19]:
# Ranking of the companies before 2011 using Ranking Score

Company_ranking_score_before_2011 = []
Company_range_score_before_2011 = []
Company_efficiency_before_2011 = []
Company_efficiency_score_before_2011 = []
Cars_production_before_2011 = []

for company in Larger_500_production_companies_list:
    company_instance = Ranking_Score(data,company,'before')
    company_dataframe,company_electric_range_score = company_instance.Electric_range_score()
    Company_range_score_before_2011.append(round(company_electric_range_score,3))
    company_efficiency, cars_produced = company_instance.Eligibility_efficiency()
    Cars_production_before_2011.append(cars_produced)
    Company_efficiency_before_2011.append(round(company_efficiency,3))
    Company_efficiency_score_before_2011.append(round(4*company_efficiency,3))
    company_ranking = (company_efficiency*4+company_electric_range_score)
    Company_ranking_score_before_2011.append(round(company_ranking,3))

Company_dict_before_2011 = {'Company':Larger_500_production_companies_list,
                            'Cars With Known Eligibility':Cars_production_before_2011,
                           'Eligibility Efficiency':Company_efficiency_before_2011,
                           'Efficiency Score':Company_efficiency_score_before_2011,
                           'Range Score':Company_range_score_before_2011,
                           'Ranking Score':Company_ranking_score_before_2011}
print('Ranking Before 2011:')
Ranking_before_2011 = pd.DataFrame(Company_dict_before_2011)
Ranking_before_2011 = Ranking_before_2011.sort_values(by=['Ranking Score'],ascending=False)
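# Mark companies with no cars before 2011 as not applicable (raw string avoids the invalid '\A' escape)
Ranking_before_2011.loc[Ranking_before_2011['Cars With Known Eligibility']==0,['Eligibility Efficiency','Efficiency Score','Range Score','Ranking Score']]=r'N\A'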
Ranking_before_2011.reset_index(drop=True)
Ranking Before 2011:
Out[19]:
Company Cars With Known Eligibility Eligibility Efficiency Efficiency Score Range Score Ranking Score
0 TESLA 42 1.0 4.0 3.75 7.75
1 TOYOTA 3 1.0 4.0 1.5 5.5
2 FORD 14 1.0 4.0 0.643 4.643
3 CHEVROLET 1 1.0 4.0 0.5 4.5
4 AUDI 0 N\A N\A N\A N\A
5 MITSUBISHI 0 N\A N\A N\A N\A
6 VOLKSWAGEN 0 N\A N\A N\A N\A
7 RIVIAN 0 N\A N\A N\A N\A
8 PORSCHE 0 N\A N\A N\A N\A
9 POLESTAR 0 N\A N\A N\A N\A
10 NISSAN 0 N\A N\A N\A N\A
11 MERCEDES-BENZ 0 N\A N\A N\A N\A
12 MINI 0 N\A N\A N\A N\A
13 BMW 0 N\A N\A N\A N\A
14 KIA 0 N\A N\A N\A N\A
15 JEEP 0 N\A N\A N\A N\A
16 HYUNDAI 0 N\A N\A N\A N\A
17 HONDA 0 N\A N\A N\A N\A
18 FIAT 0 N\A N\A N\A N\A
19 CHRYSLER 0 N\A N\A N\A N\A
20 VOLVO 0 N\A N\A N\A N\A

The ranking dataframe before 2011 shows that only 4 of these companies produced electric cars before 2011. TESLA leads with 42 cars with known eligibility and also has the highest ranking score, 7.75. FORD is second in volume with 14 cars and a ranking score of 4.643.
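
As a quick check on the table above, the ranking score in the preceding cell is four times the eligibility efficiency plus the range score. The following is a minimal sketch that recomputes it from the rounded table values. The function name `ranking_score` and the `efficiency_weight` parameter are illustrative and not part of the notebook.

```python
def ranking_score(eligibility_efficiency, range_score, efficiency_weight=4):
    """Weighted sum used in the ranking loop: weight * efficiency + range score."""
    return round(efficiency_weight * eligibility_efficiency + range_score, 3)

# (eligibility efficiency, range score) copied from the 'Ranking Before 2011' table
before_2011 = {
    'TESLA': (1.0, 3.75),
    'TOYOTA': (1.0, 1.5),
    'FORD': (1.0, 0.643),
    'CHEVROLET': (1.0, 0.5),
}

scores = {company: ranking_score(eff, rng) for company, (eff, rng) in before_2011.items()}
print(scores)
```

Because all four companies have an eligibility efficiency of 1.0, the differences in their ranking scores before 2011 come entirely from the range score.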

Overall Ranking

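We now compute the overall ranking score as a weighted sum of the ranking score before 2011 ($R_{b}$) and the ranking score after 2011 ($R_{a}$), with weights $w_{b} = 0.2$ and $w_{a} = 0.8$:

$$ R_{overall} = R_{b} \cdot w_{b} + R_{a} \cdot w_{a} = 0.2R_{b} + 0.8R_{a}$$

For companies that did not produce any cars before 2011, $R_{b}$ is undefined, so we set $ R_{overall} = R_{a}$. Likewise, a company with no cars after 2011 keeps $ R_{overall} = R_{b}$.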
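
The weighted formula and its fallback can be sketched as a small function. This is an illustration under stated assumptions, not the notebook's own code: the name `overall_score` and its parameters are hypothetical, and returning NaN when a company has no cars in either period is an assumed convention.

```python
import math

def overall_score(r_before, r_after, cars_before, cars_after,
                  w_before=0.2, w_after=0.8):
    """Weighted overall score; falls back to the only defined period score."""
    if cars_before == 0 and cars_after == 0:
        return math.nan          # assumption: no score at all is reported as NaN
    if cars_before == 0:
        return r_after           # R_overall = R_a
    if cars_after == 0:
        return r_before          # R_overall = R_b
    return round(w_before * r_before + w_after * r_after, 3)

# CHEVROLET: scores 4.5 (before) and 5.792 (after), cars in both periods
chevrolet = overall_score(4.5, 5.792, cars_before=1, cars_after=8653)
# POLESTAR: no cars before 2011, so only the after-2011 score counts
polestar = overall_score(0.0, 7.5, cars_before=0, cars_after=104)
print(chevrolet, polestar)
```

For CHEVROLET this gives 0.2 × 4.5 + 0.8 × 5.792 = 0.9 + 4.6336, which rounds to 5.534. POLESTAR keeps its after-2011 score of 7.5.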

In [20]:
# Finding the overall ranking

Company_dict = {'Company':Larger_500_production_companies_list,
                'Cars Before 2011':Cars_production_before_2011,
               'Ranking Score Before 2011':Company_ranking_score_before_2011,
                'Cars After 2011':Company_production_after_2021,
               'Ranking Score After 2011':Company_ranking_score_after_2011
               }

Overall_Ranking = pd.DataFrame(Company_dict)
print('Overall Ranking:')

# Companies with no cars in a period have no meaningful score for that period
no_cars_before = Overall_Ranking['Cars Before 2011']==0
no_cars_after = Overall_Ranking['Cars After 2011']==0

# Weighted score, falling back to the single available score when one period has no cars
Overall_Ranking['Overall Ranking Score'] = round((Overall_Ranking['Ranking Score Before 2011']*0.2+ Overall_Ranking['Ranking Score After 2011']*0.8),3)
Overall_Ranking.loc[no_cars_before,'Overall Ranking Score'] = Overall_Ranking['Ranking Score After 2011']
Overall_Ranking.loc[no_cars_after,'Overall Ranking Score'] = Overall_Ranking['Ranking Score Before 2011']
Overall_Ranking.loc[no_cars_before&no_cars_after,'Overall Ranking Score'] = np.nan

# Sort only after the fallback is applied, so the order matches the displayed scores (NaN goes last)
Overall_Ranking = Overall_Ranking.sort_values(by=['Overall Ranking Score'], ascending=False)

# Replace undefined scores with 'N\A' for display
Overall_Ranking = Overall_Ranking.astype(object)
Overall_Ranking.loc[no_cars_before,'Ranking Score Before 2011']=r'N\A'
Overall_Ranking.loc[no_cars_after,'Ranking Score After 2011']=r'N\A'
Overall_Ranking.loc[Overall_Ranking['Overall Ranking Score'].isna(),'Overall Ranking Score']=r'N\A'

Overall_Ranking.reset_index(drop=True)
Overall Ranking:
Out[20]:
Company Cars Before 2011 Ranking Score Before 2011 Cars After 2011 Ranking Score After 2011 Overall Ranking Score
0 TESLA 42 7.75 25731 7.784 7.777
1 POLESTAR 0 N\A 104 7.5 7.5
2 VOLKSWAGEN 0 N\A 1035 5.568 5.568
3 CHEVROLET 1 4.5 8653 5.792 5.534
4 NISSAN 0 N\A 11035 5.478 5.478
5 FIAT 0 N\A 803 5.0 5.0
6 CHRYSLER 0 N\A 2139 4.5 4.5
7 HONDA 0 N\A 791 4.449 4.449
8 KIA 0 N\A 3116 4.309 4.309
9 HYUNDAI 0 N\A 585 4.163 4.163
10 BMW 0 N\A 4934 3.237 3.237
11 AUDI 0 N\A 1895 2.303 2.303
12 MINI 0 N\A 325 2.132 2.132
13 TOYOTA 3 5.5 4725 1.279 2.123
14 PORSCHE 0 N\A 663 2.101 2.101
15 FORD 14 4.643 3900 0.648 1.447
16 VOLVO 0 N\A 2150 1.319 1.319
17 MERCEDES-BENZ 0 N\A 360 1.236 1.236
18 MITSUBISHI 0 N\A 710 1.198 1.198
19 JEEP 0 N\A 2328 0.0 0.0
20 RIVIAN 0 N\A 0 N\A N\A

Authors:

Md. Mahbub Talukder,
BSc. in Mechanical Engineering,
Bangladesh University of Engineering and Technology, Bangladesh.

Musarrat Bintay Hossain,
BSc. in Computer Science and Technology,
Changsha University of Science and Technology, China.